-
Notifications
You must be signed in to change notification settings - Fork 30
fix: add basic error handling to manifest-server calls #772
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This CDK VersionYou can test this version of the CDK using the following: # Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@christo/manifest-server-errors#egg=airbyte-python-cdk[dev]' --help
# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch christo/manifest-server-errors Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
|
…irbytehq/airbyte-python-cdk into christo/manifest-server-errors
/autofix
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds basic error handling to the manifest-server calls to improve error reporting in the connector builder UI. Previously, errors during operations like test reads were being swallowed and returned as generic 500-level errors, making debugging difficult for customers.
Key Changes
- Added try/catch blocks around all manifest-server endpoint operations (test_read, check, discover, resolve, full_resolve)
- Standardized error responses to return HTTP 400 with descriptive error messages using AirbyteTracedException
- Added comprehensive test coverage for error handling scenarios
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
airbyte_cdk/manifest_server/routers/manifest.py | Added error handling try/catch blocks to all endpoints with descriptive error messages |
airbyte_cdk/manifest_server/api_models/manifest.py | Added ErrorResponse model for standardized error responses |
airbyte_cdk/manifest_server/api_models/init.py | Exported the new ErrorResponse model |
airbyte_cdk/manifest_server/openapi.yaml | Updated OpenAPI spec to include 400 error responses and version bump |
airbyte_cdk/manifest_server/app.py | Updated application version to 0.2.0 |
unit_tests/manifest_server/routers/test_manifest.py | Added comprehensive test coverage for error handling scenarios |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughAdds a new ErrorResponse Pydantic model and exposes it via api_models; bumps FastAPI/OpenAPI version to 0.2.0; introduces 400 Bad Request responses and try/except sanitization in manifest router endpoints; and adds unit tests asserting error translation to ErrorResponse. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Client
participant Router as FastAPI Router
participant Processor as ManifestCommandProcessor / build_source
participant Sanitizer as filter_secrets
participant ErrorResp as ErrorResponse
Client->>Router: POST /v1/manifest/* (payload)
Router->>Processor: execute command / build source
alt success
Processor-->>Router: result
Router-->>Client: 200 OK (normal payload)
else exception
Processor-->>Router: raises Exception
Router->>Sanitizer: sanitize exception message
Sanitizer-->>Router: sanitized message
Router->>ErrorResp: construct {detail: sanitized message}
Router-->>Client: 400 Bad Request (ErrorResponse)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Would you like me to list the exact changed code snippets for each endpoint to speed review, wdyt? Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
…irbytehq/airbyte-python-cdk into christo/manifest-server-errors
Hi team, can we get a code review so we can ship this? This is preventing us from iterating on builder projects because errors are not exposed to the user |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking this on Christo! And apologies it's taken so long to review.
Overall makes sense - left one question about whether the wrapping in AirbyteTracedException is necessary. I think we should remove it if not necessary. I'll let you decide and merge when ready.
…irbytehq/airbyte-python-cdk into christo/manifest-server-errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
airbyte_cdk/manifest_server/routers/manifest.py (1)
186-204
: Bare re-raise is fine; add intent comment?Keeping parity with earlier feedback, would you add a brief comment clarifying the intentional bare raise? wdyt?
- except HTTPException: - # Re-raise HTTPExceptions as-is (like the catalog None check above) - raise + except HTTPException: + # Re-raise HTTPExceptions as-is (like the catalog None check above) + # Intentional bare raise: preserve original status and message + raise
🧹 Nitpick comments (7)
unit_tests/manifest_server/routers/test_manifest.py (1)
539-657
: Add a sanitization test by patching filter_secrets?To ensure secret scrubbing is exercised, could we patch the router’s filter_secrets to verify it’s invoked and controls the response? For example:
+@patch("airbyte_cdk.manifest_server.routers.manifest.filter_secrets", return_value="<<sanitized>>") +@patch("airbyte_cdk.manifest_server.routers.manifest.ManifestCommandProcessor") +@patch("airbyte_cdk.manifest_server.routers.manifest.build_source") +def test_check_sanitization( + self, mock_build_source, mock_runner_class, mock_filter, sample_manifest, sample_config, mock_source +): + mock_build_source.return_value = mock_source + mock_runner = Mock() + mock_runner.check_connection.side_effect = RuntimeError("secret-token-123") + mock_runner_class.return_value = mock_runner + response = client.post("/v1/manifest/check", json={"manifest": sample_manifest, "config": sample_config}) + assert response.status_code == 400 + assert response.json()["detail"] == "<<sanitized>>"This validates the sanitization pathway without coupling to secret registration internals, wdyt?
airbyte_cdk/manifest_server/routers/manifest.py (6)
70-76
: Document 422 validation responses too?Since Pydantic validation returns 422s, should we add 422 in responses for completeness (applies to all endpoints)? wdyt?
@router.post( "/test_read", operation_id="testRead", responses={ - 400: {"description": "Bad Request - Error processing request", "model": ErrorResponse} + 400: {"description": "Bad Request - Error processing request", "model": ErrorResponse}, + 422: {"description": "Validation error"}, }, )
121-137
: Consider logging and/or simplifying wrapping.We create AirbyteTracedException but only use its message; would it be simpler to filter_secrets(str(exc)) directly, or alternatively log the full traced error (logger.exception or error.emit_message) for observability? wdyt?
- except Exception as exc: - error = AirbyteTracedException.from_exception( - exc, message=f"Error reading stream: {str(exc)}" - ) - # Filter secrets from error message before returning to client - sanitized_message = filter_secrets(error.message or str(exc)) - raise HTTPException(status_code=400, detail=sanitized_message) + except Exception as exc: + # Optional: logger.exception("test_read failed") # keep server-side trace + sanitized_message = filter_secrets(f"Error reading stream: {str(exc)}") + raise HTTPException(status_code=400, detail=sanitized_message)If you prefer the traced exception for future use, maybe emit/log it instead, wdyt?
156-167
: Preserve existing HTTPExceptions (consistency with discover).To keep explicit HTTP statuses/messages from safe_build_source (or future checks), shall we re-raise HTTPException before the generic catch (like discover does)? wdyt?
def check(request: CheckRequest) -> CheckResponse: @@ - try: + try: source = safe_build_source(request.manifest.model_dump(), request.config.model_dump()) runner = ManifestCommandProcessor(source) success, message = runner.check_connection(request.config.model_dump()) return CheckResponse(success=success, message=message) + except HTTPException: + # Preserve explicit HTTP errors raised upstream + raise except Exception as exc: - error = AirbyteTracedException.from_exception( - exc, message=f"Error checking connection: {str(exc)}" - ) - # Filter secrets from error message before returning to client - sanitized_message = filter_secrets(error.message or str(exc)) + sanitized_message = filter_secrets(f"Error checking connection: {str(exc)}") raise HTTPException(status_code=400, detail=sanitized_message)
223-232
: Resolve: preserve HTTPExceptions like discover/check.For consistency and to avoid masking explicit statuses, shall we re-raise HTTPException here too? wdyt?
def resolve(request: ResolveRequest) -> ManifestResponse: @@ - try: + try: source = safe_build_source(request.manifest.model_dump(), {}) return ManifestResponse(manifest=Manifest(**source.resolved_manifest)) + except HTTPException: + raise except Exception as exc: - error = AirbyteTracedException.from_exception( - exc, message=f"Error resolving manifest: {str(exc)}" - ) - # Filter secrets from error message before returning to client - sanitized_message = filter_secrets(error.message or str(exc)) + sanitized_message = filter_secrets(f"Error resolving manifest: {str(exc)}") raise HTTPException(status_code=400, detail=sanitized_message)
255-280
: Avoid mutating source.resolved_manifest; preserve HTTPExceptions; tiny wording nit.full_resolve currently shallow-copies manifest and then mutates streams in-place, which can side-effect source.resolved_manifest across requests. Shall we deep copy before mutation, and also re-raise HTTPException for consistency? Also, consider “fully resolving” wording (optional) to read better. wdyt?
+import copy @@ - source = safe_build_source(request.manifest.model_dump(), request.config.model_dump()) - manifest = {**source.resolved_manifest} + source = safe_build_source(request.manifest.model_dump(), request.config.model_dump()) + manifest = copy.deepcopy(source.resolved_manifest) streams = manifest.get("streams", []) for stream in streams: stream["dynamic_stream_name"] = None @@ - return ManifestResponse(manifest=Manifest(**manifest)) + return ManifestResponse(manifest=Manifest(**manifest)) - except Exception as exc: - error = AirbyteTracedException.from_exception( - exc, message=f"Error full resolving manifest: {str(exc)}" - ) - # Filter secrets from error message before returning to client - sanitized_message = filter_secrets(error.message or str(exc)) + except HTTPException: + raise + except Exception as exc: + # Optional: change to "fully resolving" if you want improved grammar + sanitized_message = filter_secrets(f"Error full resolving manifest: {str(exc)}") raise HTTPException(status_code=400, detail=sanitized_message)This prevents cross-request mutation and keeps error handling consistent, wdyt?
17-26
: Minor: centralize exception-to-HTTP conversion?The try/except + sanitize + raise blocks repeat. Would extracting a small helper (e.g., raise_sanitized_http_400(context, exc)) reduce duplication and keep messages consistent across endpoints? wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte_cdk/manifest_server/routers/manifest.py
(7 hunks)unit_tests/manifest_server/routers/test_manifest.py
(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
unit_tests/manifest_server/routers/test_manifest.py (3)
unit_tests/manifest_server/command_processor/test_processor.py (2)
sample_config
(45-47)mock_source
(35-37)airbyte_cdk/manifest_server/routers/manifest.py (2)
test_read
(77-137)discover
(177-204)airbyte_cdk/manifest_server/command_processor/processor.py (3)
test_read
(30-57)check_connection
(59-84)discover
(86-107)
airbyte_cdk/manifest_server/routers/manifest.py (4)
airbyte_cdk/utils/airbyte_secrets_utils.py (1)
filter_secrets
(73-80)airbyte_cdk/utils/traced_exception.py (2)
AirbyteTracedException
(25-145)from_exception
(103-121)airbyte_cdk/manifest_server/api_models/manifest.py (1)
ErrorResponse
(88-91)airbyte_cdk/manifest_server/command_processor/processor.py (4)
test_read
(30-57)ManifestCommandProcessor
(23-119)check_connection
(59-84)discover
(86-107)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: Check: destination-motherduck
- GitHub Check: Check: source-pokeapi
- GitHub Check: Check: source-shopify
- GitHub Check: Check: source-intercom
- GitHub Check: Check: source-hardcoded-records
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.13, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.12, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Manifest Server Docker Image Build
🔇 Additional comments (1)
unit_tests/manifest_server/routers/test_manifest.py (1)
539-657
: Great coverage for CDK error translation to 400.Assertions validate detail shape and meaningful prefixes. Nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
airbyte_cdk/manifest_server/routers/manifest.py (1)
109-133
: Consider movingsafe_build_source
inside the try block for consistency.In
test_read
,safe_build_source
is called at line 109 (before the try block starts at line 120), while in the other endpoints (check
,discover
,resolve
,full_resolve
) it's called inside the try block. This means validation errors fromsafe_build_source
will propagate with the original "Invalid manifest" message intest_read
, but will be caught and prefixed with "Error checking connection:", etc. in other endpoints.Is this intentional to preserve the original validation error message in
test_read
, or should it be moved inside the try block for consistency, wdyt?Additionally, consider logging the original exception before sanitizing and re-raising to aid debugging:
runner = ManifestCommandProcessor(source) try: cdk_result = runner.test_read( config_dict, catalog, converted_state, request.record_limit, request.page_limit, request.slice_limit, ) return StreamReadResponse.model_validate(asdict(cdk_result)) except Exception as exc: + # Log the original exception for debugging + import logging + logging.getLogger("airbyte.manifest-server").exception("Error in test_read") # Filter secrets from error message before returning to client sanitized_message = filter_secrets(f"Error reading stream: {str(exc)}") raise HTTPException(status_code=400, detail=sanitized_message)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
airbyte_cdk/manifest_server/routers/manifest.py
(7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte_cdk/manifest_server/routers/manifest.py (5)
airbyte_cdk/utils/airbyte_secrets_utils.py (1)
filter_secrets
(73-80)airbyte_cdk/manifest_server/api_models/manifest.py (1)
ErrorResponse
(88-91)airbyte_cdk/manifest_server/command_processor/processor.py (4)
test_read
(30-57)ManifestCommandProcessor
(23-119)check_connection
(59-84)discover
(86-107)airbyte_cdk/manifest_server/api_models/stream.py (1)
StreamReadResponse
(67-76)airbyte_cdk/sources/declarative/concurrent_declarative_source.py (4)
discover
(382-385)resolved_manifest
(350-360)streams
(388-420)dynamic_streams
(510-514)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Check: source-shopify
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.13, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.12, Ubuntu)
🔇 Additional comments (8)
airbyte_cdk/manifest_server/routers/manifest.py (8)
17-17
: LGTM!The
filter_secrets
import is properly used throughout the file to sanitize error messages before returning them to clients.
24-24
: LGTM!The
ErrorResponse
model is correctly integrated into the API responses for all endpoints.
69-75
: LGTM!The OpenAPI documentation correctly describes the 400 error response with the ErrorResponse model.
136-161
: LGTM!The try/except block correctly wraps the entire connection check logic and sanitizes error messages before returning them to clients.
188-190
: LGTM!The HTTPException re-raise pattern correctly preserves the original status code (e.g., 422 for missing catalog) while catching and sanitizing other exceptions as 400s. This is the right approach for distinguishing between different error types.
163-195
: Good pattern for preserving HTTP status codes.The discover endpoint correctly re-raises HTTPExceptions (like the 422 for missing catalog at lines 184-186) while catching and sanitizing other exceptions as 400s. This ensures that specific HTTP status codes are preserved where appropriate.
197-220
: LGTM!The error handling follows the established pattern consistently.
222-264
: LGTM!The try/except block correctly handles errors during manifest resolution and dynamic stream generation, with proper secret filtering.
What
Resolves: https://github.com/airbytehq/airbyte-internal-issues/issues/14540
Loom Demo: https://www.loom.com/share/3f063d020c774861ab0c40a10c429b11
Adds basic try/catch error handling to the manifest-server calls to ensure errors during connector builder calls are surfaced in the builder UI.
Currently, customers are experiencing issues with errors during test reads being swallowed and thrown as generic 500-level errors.
How to Test
For quick backend tests, use Postman against a locally deployed instance of the manifest-server. This was the test manifest I used in the request body to replicate the error encountered in the Swarm ticket.
On main, this will cause an unhandled exception to throw the 500 server error.
With the changes, the actual error is caught and the message displayed in the details.
To Test in Platform
Steps to run manifest-server locally and connect to local OSS Airbyte instance:
https://gist.github.com/ChristoGrab/1abaaacc0d1879d8642c946336feb5a9
Summary by CodeRabbit